YouTube videos for Swe-Polybench Benchmark
Revolutionizing AI-Driven Software Development: SWE-PolyBench Benchmark
SWE-bench: The AI Coding Benchmark Every Dev Must Know
How “good” are AI coding agents really? | BENCHMARKS
SWE-Bench+: Enhanced Coding Benchmark for LLMs (October 2024)
SciCode, AssistantBench, CiteME and SWE-bench: Summer of Benchmarks
What do AI Benchmarks Actually Mean?! A Fast Breakdown (MMLU, SWE-bench, & More Explained)
SWE-fficiency: Benchmarking LLM Code Speedups
Interpreting SWE-bench Scores
Evolutionary AI Coders
AI Coding Agents Coming in HOT! 🥵
"Benchmarking: You're Doing It Wrong" by Aysylu Greenberg
Microbenchmarking with Google's Benchmark
Benchmarking the Modular Structural Analysis Algorithm
4x Code Performance with SIMD
#283 Test: Humanity's Last Exam (HLE)
This lesson taught me how to run better benchmarks
The Science of Benchmarking Panel (NeurIPS 2025 Tutorial)